Abstract

We study correlation bounds under pairwise independent distributions 
for functions with no large Fourier coefficients. Functions in which all 
Fourier coefficients are bounded by $\delta$ are called $\delta$-uniform. The search for
such bounds is motivated by their potential applicability to hardness of 
approximation, derandomization, and additive combinatorics. 

In our main result we show that $E[f_1(X_1^1, \ldots, X_1^n) \cdots f_k(X_k^1, \ldots, X_k^n)]$
is close to $0$ under the following assumptions:

• The vectors $\{(X_1^j, \ldots, X_k^j) : 1 \le j \le n\}$ are i.i.d., and for each $j$ the
vector $(X_1^j, \ldots, X_k^j)$ has a pairwise independent distribution.

• The functions $f_i$ are uniform.

• The functions $f_i$ are of low degree.

We compare our result with recent results by the second author for low 
influence functions and to recent results in additive combinatorics using 
the Gowers norm. Our proofs extend some techniques from the theory of 
hypercontractivity to a multilinear setup. 
1 Introduction



1.1 Functionals of Pairwise Independent Distributions 

In recent years there has been an extensive study of conditions satisfied by
functions $f_1, \ldots, f_k$ which guarantee that

$$E[f_1(X_1) \cdots f_k(X_k)] \approx \prod_{i=1}^{k} E[f_i(X_i)], \qquad (1)$$

for certain probability distributions over $(X_1, \ldots, X_k)$ that are pairwise
independent. Recall that the random vector $(X_1, \ldots, X_k)$ is pairwise independent
if for all $1 \le i < j \le k$ the random variables $X_i$ and $X_j$ are independent.
In the current paper we will consider this problem under the additional
assumption that for all $1 \le i \le k$ the random variable $X_i$ is an $n$-dimensional
vector $X_i = (X_i^1, \ldots, X_i^n) \in \Omega^n$ and that $(X_1^j, \ldots, X_k^j)$ follow the same
(pairwise independent) distribution $\mu$ over $\Omega^k$, independently for each $1 \le j \le n$ (see
Figure 1). We further assume that $\Omega$ is a finite probability space.
Figure 1: The random matrix $X$. The columns $X^1, \ldots, X^n$ are
i.i.d. random vectors, and the distribution of the column $X^j =
(X_1^j, \ldots, X_k^j)^T$ is pairwise independent, for each $j \in [n]$.



The basic example of a condition implying (1) is that given in the proof of
Roth's Theorem [16]. This argument yields that

$$\left| E\left[\prod_{i=1}^{3} f_i(X_i)\right] - \prod_{i=1}^{3} E[f_i(X_i)] \right| \le \delta, \qquad \delta := \max_{1 \le i \le 3} \|\hat{f}_i\|_\infty, \qquad (2)$$

where

• $(X_1, X_2, X_3)$ are pairwise independent.

• $f_1, f_2, f_3$ are any functions with $\max_{1 \le i \le 3} \|f_i\|_2 \le 1$ and $\hat{f}_1, \hat{f}_2, \hat{f}_3$ are
their Fourier transforms.

Roth's original argument considered $(X_1, X_2, X_3)$ which is a uniformly chosen
3-term arithmetic progression in $\mathbb{Z}_p$, but the argument extends immediately to
the setup considered here.
Gowers [5] generalized (2) and showed that:

$$\left| E\left[\prod_{i=1}^{k} f_i(X_i)\right] \right| \le \min_{1 \le i \le k} \|f_i\|_{U^k}, \qquad (3)$$

where

• $(X_1, \ldots, X_k)$ is a uniformly chosen $k$-term arithmetic progression in $\mathbb{Z}_p^n$.

• The functions $f_i$ are all bounded by 1.

• $\|f\|_{U^d}$ is the $d$'th Gowers norm of $f$ (see Definition 2.7).

Note that the uniform distribution over arithmetic progressions $X_1, \ldots, X_k$
of length $3 \le k \le p$ defines a pairwise independent distribution in $(\mathbb{Z}_p^n)^k$. See
also [6] and [I] where more general results are obtained for other pairwise inde- 
pendent distributions which are defined by linear equations. 

Apart from the additive context, expressions of the form $\prod_{i=1}^{k} f_i(X_i)$ often
appear in the study of hardness of approximation in computer science. In this
context, a natural condition is that the functions $f_1, \ldots, f_k$ all have low
influences. For example, recent results of Samorodnitsky and Trevisan [17] show
how to utilize the Gowers norms in order to show that (here, $\mathrm{Inf}_j(f_i)$ is the
influence of $X_i^j$ on $f_i$, see e.g. [17] for the exact definition):



E 



HfiiX, 



< o 



I max max Inf 

l<i<2 fc l<j<n 



(4) 



provided that:

• $X_1, \ldots, X_{2^k}$ are the elements of a uniformly chosen $k$-dimensional subspace
of $\mathbb{Z}_2^n$.

• The functions $f_i$ are all bounded by 1.

This in turn allowed the authors to obtain computational inapproximability
results for certain constraint satisfaction problems, assuming the so-called
Unique Games Conjecture [7]. The results of [17] include a more general
statement which applies in any product group.

A more recent result of the second author [11] (see also [10]) derives a bound
similar to (4) by showing:



E 



where ^(e) 



,i=l J i=l 

0(log(i/e)/6 ) p rov ided that: 



< ^ max max Inf„ (/j) 

1 l<i<fc l<j<n J 



(5) 



• The distribution $\mu$ of $(X_1^j, \ldots, X_k^j)$ is any pairwise independent
distribution which is connected. This means that for every $x, y$ in the support
of the distribution there exists a path from $x$ to $y$ in the support that is
obtained by flipping one coordinate at a time.

• The functions $f_i$ are all bounded by 1.
The proof of (5) is based on showing that if all functions $f_i$ are of degree
at most $d$ then:



E 



flMXi) 



< C d ( max max Infj(/ 4 ) ) 



(6) 



for some absolute constant $C$ provided that:

• The distribution $\mu$ of $(X_1^j, \ldots, X_k^j)$ is any pairwise independent
distribution.

• The functions $f_i$ satisfy $\|f_i\|_2 \le 1$ for all $i$.

The bound (5) is then derived from (6) by applying certain truncation
arguments. These results of [11] do not use any algebraic symmetries or the Gowers
norm. Rather, they are based on extending Lindeberg's proof of the CLT [9]
using invariance and generalizing recent work [15, 12].

We note that results of [11] later implied results by the authors of this
paper [1] which gave stronger and more general inapproximability results than
those obtained in [17]. It was further noted in [11] that many of the additive
applications involve pairwise independent distributions.



1.2 Our Results 

Motivated by these lines of work in additive number theory and hardness of
approximation we wish to obtain weaker conditions that guarantee (1). Indeed
our main result, Theorem 3.2, shows that

$$\left| E\left[\prod_{i=1}^{k} f_i(X_i)\right] \right| \le C^{d} \, \|\hat{f}_1\|_\infty \prod_{i=2}^{k} \|f_i\|_2 \qquad (7)$$

for some constant $C$ which only depends on the pairwise independent
distribution $\mu$, where

• $\|\hat{f}_1\|_\infty = \max_\sigma |\hat{f}_1(\sigma)|$ denotes the size of the largest Fourier coefficient of
$f_1$.

• $(X_1, \ldots, X_k)$ is pairwise independent as in Figure 1.

• The functions $f_i$ are of Fourier degree at most $d$. In other words, all of
their Fourier coefficients at levels above $d$ are 0.

We also give some basic extensions of this. In particular, Corollary 3.8 shows
that, in the case when (1) does not hold, one can find three Fourier coefficients
$\hat{f}_{i_1}(\sigma_1)$, $\hat{f}_{i_2}(\sigma_2)$ and $\hat{f}_{i_3}(\sigma_3)$ which are all of non-negligible magnitude, and which
"intersect" in the sense that $\sigma_1$, $\sigma_2$ and $\sigma_3$ share some variable $j \in [n]$.

We note that the conditions on the underlying distribution and uniformity
are very weak while the condition on the Fourier degree of the function is
very strong. By a simple application of Hölder's inequality, we will see in
Proposition 3.9 that the results extend to functions which are "almost
low-degree" in the sense that the high-degree parts have small norm.

Note that our result (7) is stronger than (6) as the bound is stated in terms of
the largest Fourier coefficient instead of the largest influence (and that it suffices
that only one of the functions has small coefficients, as opposed to (6) where
all the functions are required to have small influences). A very natural question
to ask is whether the degree restriction can be relaxed further. As mentioned
above the proofs in [11] are achieved by first establishing a result for low-degree
polynomials and then performing a truncation argument. The work presented
in this paper may be viewed as an important step in establishing similar results
for a wider family of functions. We elaborate on this issue in Section 5.

To compare our results with (5), note that (5) requires the stronger condition
that all the $f_i$ have low influences and that the pairwise independent distribution
has to be connected. We also note that the conclusion derived in (5) is stronger:
first, it applies to general (not bounded degree) functions and secondly it is
shown that $f_1, \ldots, f_k$ are, in fact, close to being independent.

Further, our results should be compared to what is known about the Gowers
norm and the corresponding pairwise independent relation. Here it is easy to
see and well known that if a bounded function has large $U^2$ norm, then it has a
large Fourier coefficient. However, it is known that such a conclusion does not
hold for higher degree Gowers norms.

1.3 Applications 

The applications we present mostly concern uniform functions of low Fourier
degree. We show that such functions cannot "distinguish" between truly
independent distributions and pairwise independent product distributions unless
they have a large coefficient. In particular we show that such functions defined
over $\mathbb{Z}_p^n$ have low Gowers norm. This implies that for functions of low Fourier
degree all of the $U^k$ norms are equivalent for $k \ge 2$. Moreover, such functions
cannot distinguish the uniform distribution over arithmetic progressions from
the uniform distribution over the product space.

1.4 Proof Idea 

The proof of (7) is based on induction on the degree and the number of variables.
In a way it is similar to inductive proofs for deriving hypercontractive estimates
for polynomials of random variables, see, e.g., [12]. Naturally the setup is
different as each polynomial is applied to different random variables. The pairwise
independence property is crucial in the proof as it shows that certain second
order terms vanish.

1.5 Paper Structure 

In Section 2 we recall some background in Fourier analysis and noise correlation.
In Section 3 we derive the main result and some corollaries. In Section 4 we
derive some applications of the main result. In Section 5 we discuss potential
extensions of the main result.
2 Preliminaries



2.1 Notation 

Let $\Omega$ be a finite set and let $\mu$ be a probability distribution on $\Omega$. The following
notation will be used throughout the paper.

• $(\Omega^n, \mu^{\otimes n})$ denotes the product space $\Omega \times \cdots \times \Omega$, endowed with the product
distribution.

• $\alpha(\mu) = \min\{ \mu(x) : x \in \Omega, \mu(x) > 0 \}$ denotes the minimum non-zero
probability of any atom in $\Omega$ under the distribution $\mu$.

• $L^2(\Omega, \mu)$ denotes the space of functions from $\Omega$ to $\mathbb{C}$. We define the
inner product on $L^2(\Omega, \mu)$ by $\langle f, g \rangle = E_{x \in (\Omega, \mu)}[f(x)\overline{g(x)}]$, and the $\ell_p$ norm by
$\|f\|_p = \left(E_{x \in (\Omega, \mu)}[|f|^p]\right)^{1/p}$.

For a probability distribution $\mu$ on $\Omega_1 \times \cdots \times \Omega_k$ (not necessarily a product
distribution) and $i \in [k]$, we use $\mu_i$ to denote the marginal distribution on $\Omega_i$.
Such a distribution $\mu$ is said to be pairwise independent if for every $1 \le i < j \le k$
and every $a \in \Omega_i$, $b \in \Omega_j$ it holds that $\Pr_{x \in (\Omega_1 \times \cdots \times \Omega_k, \mu)}[x_i = a \wedge x_j = b] =
\mu_i(a)\mu_j(b)$.
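The definition of pairwise independence can be checked mechanically on small examples. The following is a minimal sketch, not part of the original text: the helper names `marginal` and `is_pairwise_independent` are hypothetical, and a distribution is assumed to be given as a dictionary from $k$-tuples to probabilities. The first example is the distribution on triples in $\{-1,1\}^3$ with $x \cdot y \cdot z = 1$ that is used in Section 5.1.

```python
from collections import defaultdict
from itertools import combinations

def marginal(dist, coords):
    """Marginal of `dist` (dict: k-tuple -> probability) on the given coordinates."""
    m = defaultdict(float)
    for point, prob in dist.items():
        m[tuple(point[c] for c in coords)] += prob
    return m

def is_pairwise_independent(dist, k, tol=1e-12):
    """Check Pr[x_i = a and x_j = b] = mu_i(a) * mu_j(b) for all i < j and all a, b."""
    singles = [marginal(dist, (i,)) for i in range(k)]
    for i, j in combinations(range(k), 2):
        joint = marginal(dist, (i, j))
        for (a,), pa in singles[i].items():
            for (b,), pb in singles[j].items():
                if abs(joint.get((a, b), 0.0) - pa * pb) > tol:
                    return False
    return True

# Uniform distribution over triples in {-1,1}^3 with x*y*z = 1.
parity = {(x, y, x * y): 0.25 for x in (-1, 1) for y in (-1, 1)}
# Three identical copies of one uniform bit: fully correlated, for contrast.
copies = {(1, 1, 1): 0.5, (-1, -1, -1): 0.5}

print(is_pairwise_independent(parity, 3))
print(is_pairwise_independent(copies, 3))
```

The parity distribution passes the check even though its three coordinates are jointly dependent, which is the situation the rest of the paper studies.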

2.2 Fourier Decomposition 

In this subsection we recall some background in Fourier analysis that will be 
used in the paper. 

Let $q$ be a positive integer (not necessarily a prime power), and let $(\Omega, \mu)$ be
a finite probability space with $|\Omega| = q$, which is non-degenerate in the sense that
$\mu(x) > 0$ for every $x \in \Omega$. Let $\chi_0, \ldots, \chi_{q-1} : \Omega \to \mathbb{C}$ be an orthonormal basis
for the space $L^2(\Omega, \mu)$ w.r.t. the scalar product $\langle \cdot, \cdot \rangle$. Furthermore, we require
that this basis has the property that $\chi_0 = \mathbf{1}$, i.e., the function that is identically
1 on every element of $\Omega$.

We remark that since the choice of basis is essentially arbitrary, one can take
$\chi_0, \ldots, \chi_{q-1}$ to be an $\mathbb{R}$-valued basis rather than a $\mathbb{C}$-valued one (which can be
desirable in the case when one works exclusively with $\mathbb{R}$-valued functions). The
only place in the paper where this distinction makes a difference is the final part
of Theorem 3.2, where this is stated explicitly.

In the complex valued case when $\mu$ is the uniform distribution we can take
the standard Fourier basis $\chi_y(x) = \exp(2\pi i x y / q)$ where we identify $\Omega$ with $\mathbb{Z}_q$
in some canonical way.

For $\sigma \in \mathbb{Z}_q^n$, define $\chi_\sigma : \Omega^n \to \mathbb{C}$ as $\bigotimes_{i \in [n]} \chi_{\sigma_i}$, i.e.,

$$\chi_\sigma(x_1, \ldots, x_n) = \prod_{i \in [n]} \chi_{\sigma_i}(x_i).$$

It is well-known and easy to check that the functions $\{\chi_\sigma\}_{\sigma \in \mathbb{Z}_q^n}$ form an
orthonormal basis for the product space $L^2(\Omega^n, \mu^{\otimes n})$. Thus, every function $f \in
L^2(\Omega^n, \mu^{\otimes n})$ can be written as

$$f(x) = \sum_{\sigma \in \mathbb{Z}_q^n} \hat{f}(\sigma) \chi_\sigma(x),$$

where $\hat{f} : \mathbb{Z}_q^n \to \mathbb{C}$ is defined by $\hat{f}(\sigma) = \langle f, \chi_\sigma \rangle$. The most basic properties
of $\hat{f}$ are summarized by Fact 2.1, which is an immediate consequence of the
orthonormality of $\{\chi_\sigma\}_{\sigma \in \mathbb{Z}_q^n}$.

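Fact 2.1. We have

$$E[f \overline{g}] = \sum_{\sigma} \hat{f}(\sigma) \overline{\hat{g}(\sigma)}, \qquad E[f] = \hat{f}(0), \qquad \mathrm{Var}[f] = \sum_{\sigma \ne 0} |\hat{f}(\sigma)|^2.$$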
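As a concrete illustration of the Fourier expansion and of the identities $E[f] = \hat{f}(0)$ and $\mathrm{Var}[f] = \sum_{\sigma \ne 0} |\hat{f}(\sigma)|^2$, the sketch below computes all coefficients by brute force. It assumes the uniform distribution on $\mathbb{Z}_q^n$ with the standard basis $\chi_y(x) = \exp(2\pi i x y / q)$; the function names and the test function `f` are illustrative choices, not taken from the paper.

```python
import cmath
from itertools import product

def chi(sigma, x, q):
    """Standard character chi_sigma(x) = prod_i exp(2*pi*i*sigma_i*x_i/q)."""
    return cmath.exp(2j * cmath.pi * sum(s * xi for s, xi in zip(sigma, x)) / q)

def fourier_coefficients(f, q, n):
    """Return {sigma: <f, chi_sigma>} under the uniform distribution on Z_q^n."""
    points = list(product(range(q), repeat=n))
    return {
        sigma: sum(f(x) * chi(sigma, x, q).conjugate() for x in points) / len(points)
        for sigma in points
    }

q, n = 3, 2
f = lambda x: (x[0] * x[1] + 2 * x[0]) % q + 0.5   # arbitrary real-valued example
coeffs = fourier_coefficients(f, q, n)

points = list(product(range(q), repeat=n))
mean = sum(f(x) for x in points) / len(points)
variance = sum(abs(f(x) - mean) ** 2 for x in points) / len(points)
zero = (0,) * n
nonzero_weight = sum(abs(c) ** 2 for s, c in coeffs.items() if s != zero)

print(abs(coeffs[zero] - mean), abs(nonzero_weight - variance))
```

Both printed differences should be at the level of floating-point rounding.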

We refer to the transform $f \mapsto \hat{f}$ as the Fourier transform, and $\hat{f}$ as the
Fourier coefficients of $f$. We remark that the article "the" is somewhat
inappropriate, since the transform and coefficients in general depend on the choice of
basis $\{\chi_i\}_{i \in \mathbb{Z}_q}$. However, we will always be working with some fixed (albeit
arbitrary) basis, and hence there should be no ambiguity in referring to the Fourier
transform as if it were unique. Furthermore, most of the important properties
of $\hat{f}$ are actually basis-independent. In particular Definition 2.3 to Fact 2.5 do
not depend on the choice of Fourier basis.

Before proceeding, let us introduce some useful notation in relation to the 
Fourier transform. 

Definition 2.2. A multi-index is a vector $\sigma \in \mathbb{Z}_q^n$, for some $q$ and $n$. The
support of a multi-index $\sigma$ is $S(\sigma) = \{ i : \sigma_i > 0 \} \subseteq [n]$. We extend notation
defined for $S(\sigma)$ to $\sigma$ in the natural way, and write e.g. $|\sigma|$ instead of $|S(\sigma)|$,
$i \in \sigma$ instead of $i \in S(\sigma)$, and so on.

Definition 2.3. The (Fourier) degree $\deg(f)$ of $f \in L^2(\Omega^n, \mu^{\otimes n})$ is the infimum
of all $d \in \mathbb{Z}$ such that $\hat{f}(\sigma) = 0$ for all $\sigma$ with $|\sigma| > d$.

The degree of $f$ is one of its most important properties. In general, the
smaller $\deg(f)$ is, the more "nicely behaved" $f$ is. When $\deg(f) \le d$, we will
refer to $f$ as a degree-$d$ polynomial in $L^2(\Omega^n, \mu^{\otimes n})$.

Definition 2.4. For $f : \Omega^n \to \mathbb{C}$ and $d \in \mathbb{Z}$, the function $f^{\le d} : \Omega^n \to \mathbb{C}$ is
defined by

$$f^{\le d} = \sum_{|\sigma| \le d} \hat{f}(\sigma) \chi_\sigma.$$

We define $f^{<d}$, $f^{=d}$, $f^{>d}$ and $f^{\ge d}$ analogously.

Another fact which is sometimes useful is the following trivial bound on the
$\ell_\infty$ norm of $\chi_\sigma$ (recall that $\alpha(\mu)$ is the minimum non-zero probability of any
atom in $\mu$).

Fact 2.5. Let $(\Omega^n, \mu^{\otimes n})$ be a product space with Fourier basis $\{\chi_\sigma\}_{\sigma \in \mathbb{Z}_q^n}$. Then
for any $\sigma \in \mathbb{Z}_q^n$,

$$\|\chi_\sigma\|_\infty \le \alpha(\mu)^{-|\sigma|/2}.$$
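A quick numerical sanity check of Fact 2.5 is possible for a biased bit. The sketch below is an illustration under stated assumptions: it takes $\Omega = \{0,1\}$ with $\mu(1) = p$ and uses the orthonormal basis $\chi_0 = 1$, $\chi_1(x) = (x - p)/\sqrt{p(1-p)}$, which is one valid choice of basis and not necessarily the one the authors have in mind. The function names are hypothetical.

```python
import math
from itertools import product

def biased_basis(p):
    """Orthonormal basis of L^2({0,1}, mu) with mu(1) = p: chi_0 = 1, chi_1 = (x - p)/sd."""
    sd = math.sqrt(p * (1 - p))
    return [lambda x: 1.0, lambda x: (x - p) / sd]

def sup_norm_of_character(sigma, p):
    """Maximum of |chi_sigma(x)| over all x in {0,1}^n, where n = len(sigma)."""
    basis = biased_basis(p)
    return max(
        abs(math.prod(basis[s](xi) for s, xi in zip(sigma, x)))
        for x in product((0, 1), repeat=len(sigma))
    )

def fact_2_5_bound(sigma, p):
    """The bound alpha(mu)^(-|sigma|/2), with |sigma| the number of non-zero entries."""
    alpha = min(p, 1 - p)
    return alpha ** (-sum(1 for s in sigma if s > 0) / 2)

for p in (0.1, 0.3, 0.5):
    for sigma in product((0, 1), repeat=3):
        assert sup_norm_of_character(sigma, p) <= fact_2_5_bound(sigma, p) + 1e-12
```

For $p = 1/2$ the sup norm is exactly 1 while the bound is $2^{|\sigma|/2}$, which matches the remark later in the paper that a better constant is available in the balanced case.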
2.3 Noise Correlation 

In this section we introduce the notion of noise correlation. 

Various special cases of noise correlation have been the focus of much work, as
we discuss below. Informally, the noise correlation between two functions $f$ and
$g$ measures how much $f(x)$ and $g(y)$ correlate on random inputs $x$ and $y$ which
are correlated. We remark that the name "noise correlation" is a slight misnomer
and that "correlation under noise" would be a more descriptive name: we are
not looking at how well a random variable correlates with noise, but rather how
well a collection of random variables correlate with each other in the presence
of noise.

Definition 2.6. Let $(\Omega, \mu)$ be a product space with $\Omega = \Omega_1 \times \cdots \times \Omega_k$, and let
$f_1, \ldots, f_k$ be functions with $f_i \in L^2(\Omega_i^n, \mu_i^{\otimes n})$. The noisy inner product, or
noise correlation, of $f_1, \ldots, f_k$ with respect to $\mu$ is

$$\langle f_1, f_2, \ldots, f_k \rangle_\mu = E\left[\prod_{i=1}^{k} f_i\right].$$

As it can take some time to get used to Definition 2.6, let us write out
$\langle f_1, \ldots, f_k \rangle_\mu$ more explicitly. Let $f_i : \Omega_i^n \to \mathbb{C}$ be functions on the product space
$\Omega_i^n$, and let $\mu$ be some probability distribution on $\Omega = \Omega_1 \times \cdots \times \Omega_k$. Then,

$$\langle f_1, \ldots, f_k \rangle_\mu = E_X\left[\prod_{i=1}^{k} f_i(X_i)\right],$$

where $X$ is a $k \times n$ random matrix such that each column of $X$ is a sample from
$(\Omega, \mu)$, independently of the other columns, and $X_i$ refers to the $i$th row of $X$.
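The explicit form of the noise correlation can be evaluated exactly for small $n$ by enumerating all choices of the $n$ columns. The following sketch assumes $\mu$ is given as a dictionary from columns ($k$-tuples) to probabilities; the function name `noise_correlation` and the example functions are illustrative, and the distribution is the $x \cdot y \cdot z = 1$ example from Section 5.1.

```python
import math
from itertools import product

def noise_correlation(fs, mu, n):
    """Exact E[prod_i f_i(X_i)], where the n columns of X are i.i.d. samples from mu.

    fs: list of k functions, each taking an n-tuple (one row of X).
    mu: dict mapping a column (k-tuple) to its probability.
    """
    k = len(fs)
    total = 0.0
    for columns in product(mu.items(), repeat=n):
        prob = math.prod(p for _, p in columns)
        # Row i of X collects the i-th entry of every column.
        rows = [tuple(col[i] for col, _ in columns) for i in range(k)]
        total += prob * math.prod(f(row) for f, row in zip(fs, rows))
    return total

parity = {(x, y, x * y): 0.25 for x in (-1, 1) for y in (-1, 1)}

char = lambda row: row[0] * row[1]    # the character on coordinates {1, 2}
first = lambda row: row[0]
second = lambda row: row[1]

same = noise_correlation([char, char, char], parity, 2)
mixed = noise_correlation([first, first, second], parity, 2)
print(same, mixed)
```

Here `same` equals 1 although each factor has expectation 0: a function with a single large Fourier coefficient detects the dependence in $\mu$, whereas `mixed` vanishes because the three characters do not share their support.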

The notation $\langle f_1, \ldots, f_k \rangle_\mu$ is a new notation for quantities studied before
in e.g. [TT], its applications [H E] and additive number theory. The focus of 
the current paper is the case where $X_1, \ldots, X_k$ are pairwise independent, though noise
correlation is also of much interest for distributions that are not pairwise
independent, including in percolation, theoretical computer science and social choice,

see e.g. (ana nana. 



2.3.1 The Gowers Norm 

An instance of noise correlation which has been the focus of much attention in 
recent years is the Gowers norm, which we will now define. Let p be a prime. 
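For a function $f : \mathbb{Z}_p^n \to \mathbb{C}$ and a "direction" $Y \in \mathbb{Z}_p^n$, the "derivative" of $f$ in
direction $Y$, $f_Y : \mathbb{Z}_p^n \to \mathbb{C}$, is defined by $f_Y(X) = f(X + Y)\overline{f(X)}$. Repeating,
we define

$$f_{Y_1, \ldots, Y_d}(X) = (f_{Y_1, \ldots, Y_{d-1}})_{Y_d}(X) = \prod_{S \subseteq [d]} \mathcal{C}^{d - |S|} f\Big(X + \sum_{i \in S} Y_i\Big),$$

where $\mathcal{C}$ denotes the complex conjugation operator.

Definition 2.7. Let $f : \mathbb{Z}_p^n \to \mathbb{C}$. The $d$'th Gowers norm of $f$, denoted $\|f\|_{U^d}$,
is defined by

$$\|f\|_{U^d}^{2^d} = E[f_{Y_1, \ldots, Y_d}(X)],$$

where the expected value is over a random $X \in \mathbb{Z}_p^n$ and $d$ random directions
$Y_1, \ldots, Y_d$.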
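The iterated-derivative definition translates directly into a brute-force computation for tiny parameters. The sketch below is illustrative only: the function names are hypothetical and the enumeration is exponential in $d$ and $n$. It evaluates $\|f\|_{U^d}^{2^d}$ for the quadratic phase $f(x) = \exp(2\pi i x^2 / p)$ on $\mathbb{Z}_3$, compares the $U^2$ value with $\sum_\sigma |\hat{f}(\sigma)|^4$ (a standard identity, with coefficients taken w.r.t. the standard Fourier basis), and also evaluates the $U^3$ value.

```python
import cmath
from itertools import product

def add(x, y, p):
    """Coordinate-wise addition in Z_p^n."""
    return tuple((a + b) % p for a, b in zip(x, y))

def derivative(f, y, p):
    """The 'derivative' f_Y(X) = f(X + Y) * conj(f(X))."""
    return lambda x: f(add(x, y, p)) * f(x).conjugate()

def gowers_norm_power(f, d, p, n):
    """Return ||f||_{U^d}^(2^d) = E[f_{Y_1,...,Y_d}(X)] by full enumeration."""
    points = list(product(range(p), repeat=n))
    total = 0
    for ys in product(points, repeat=d):
        g = f
        for y in ys:
            g = derivative(g, y, p)
        total += sum(g(x) for x in points)
    return total / len(points) ** (d + 1)

p, n = 3, 1
quad = lambda x: cmath.exp(2j * cmath.pi * x[0] ** 2 / p)   # quadratic phase

u2 = gowers_norm_power(quad, 2, p, n)
u3 = gowers_norm_power(quad, 3, p, n)
fourier_l4 = sum(
    abs(sum(quad((x,)) * cmath.exp(-2j * cmath.pi * s * x / p) for x in range(p)) / p) ** 4
    for s in range(p)
)
print(abs(u2 - fourier_l4), abs(u3 - 1))
```

The third derivative of a quadratic phase is identically 1, so its $U^3$ norm is 1 even though all of its Fourier coefficients have the same small magnitude; this is the usual example behind the remark in Section 1.2 that a large higher-order Gowers norm does not force a large Fourier coefficient.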

This norm was introduced by Gowers [3] in a Fourier-analytic proof of Sze- 
meredi's Theorem [T5] and has since been used extensively in additive number 
theory. The Gowers norm can be written as a noise correlation. Indeed, we can 
write 



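$$\|f\|_{U^d}^{2^d} = E\left[\prod_{S \subseteq [d]} g_S(X_S)\right] = \langle g_\emptyset, \ldots, g_{[d]} \rangle_\mu,$$

where we define $g_S : \mathbb{Z}_p^n \to \mathbb{C}$ by $g_S(X) = \mathcal{C}^{d - |S|} f(X)$, and the collection
$(X_S)_{S \subseteq [d]}$ of random variables is defined by $X_S = X + \sum_{i \in S} Y_i$ for a uniformly
random $X \in \mathbb{Z}_p^n$ and independent uniformly random directions $Y_1, \ldots, Y_d \in \mathbb{Z}_p^n$,
and $\mu$ denotes the joint distribution of $(X_S)_{S \subseteq [d]}$.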

2.3.2 Noise Correlation Under Pairwise Independence 

This paper focuses on noise correlation under pairwise independent distribu- 
tions. The interest in this special case comes from applications in computer 
science and additive number theory. We briefly mention a few of these applica- 
tions. 

• In computer science there is interest in pairwise independent distributions
in hardness of approximation, in particular those of small support. See [1]
where the results of [10, 11] were used to derive hardness results based on
pairwise independence.

• As mentioned above, the Gowers norm and the Gowers inner-product are
both noise correlations. Note that the collection of vectors $\{X + \sum_{i \in S} Y_i :
S \subseteq [d]\}$ is pairwise (in fact 3-wise as long as $d \ge 2$) independent.

• Another noise correlation that is closely related to additive applications is
obtained by considering arithmetic progressions. For concreteness consider
again the case where all the functions are of the form $\mathbb{Z}_p^n \to \{0, 1\}$ and let $k \le p$.
Given $k$ such functions $f_1, \ldots, f_k$ we let:

$$P_n(f_1, \ldots, f_k) = E\left[\prod_{i=1}^{k} f_i(iX + Y)\right],$$

where $X, Y$ are independent and uniformly chosen in $\mathbb{Z}_p^n$ (note that $iX + Y$
and $jX + Y$ are independent for $1 \le i < j \le k$). If $A$ is an indicator of a
set then the density of $k$-term progressions in $A$, i.e. the fraction of pairs
$(X, Y)$ whose progression lies entirely in $A$, is in fact:

$$P_n(A, A, \ldots, A).$$



3 Main Theorem 

In this section, we state and prove our main theorem. First we define the
parameter which controls how good a bound we get.

Definition 3.1. Let $f_1, \ldots, f_k$ be a collection of functions. We denote by
$\deg_{-2}(f_1, \ldots, f_k)$ the sum of the $k - 2$ smallest degrees of $f_1, \ldots, f_k$.

We can now state the main theorem.

Theorem 3.2. Let $(\Omega, \mu)$ be a pairwise independent product space $\Omega = \Omega_1 \times
\cdots \times \Omega_k$. There is a constant $C$ depending only on $\mu$ such that the following
holds.

Let $f_1, \ldots, f_k$ be functions $f_i \in L^2(\Omega_i^n, \mu_i^{\otimes n})$. Denote by $\delta := \max_{\sigma \in \mathbb{Z}_q^n} |\hat{f}_1(\sigma)|$
the size of the largest Fourier coefficient of $f_1$, and let $D := \deg_{-2}(f_1, \ldots, f_k)$
denote the sum of the $k - 2$ smallest degrees of $f_1, \ldots, f_k$. Then,

$$|\langle f_1, \ldots, f_k \rangle_\mu| \le C^D \delta \prod_{i=2}^{k} \|f_i\|_2.$$

Furthermore, one can always take $C = \left(k\sqrt{(q-1)/\alpha}\right)^3$, where $\alpha = \min_i \alpha(\mu_i)$. If
$\mu$ is balanced, i.e., if all marginals $\mu_i$ are uniform, then there is a choice of
complex Fourier basis such that one can take $C = (k\sqrt{q-1})^3$.

We remark that, while Theorem 3.2 is very limited because of its
requirement on the degrees of the $f_i$'s, the lack of any other assumptions is nice. In
particular, we do not need to assume that the $f_i$'s are bounded, nor do we need
any assumptions on $\mu$ beyond the pairwise independence condition.
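To make the statement concrete, the sketch below evaluates both sides of the bound for random low-degree polynomials in one small balanced case. Everything here is an illustrative assumption rather than part of the paper: $k = 3$, $n = 3$, $\Omega_i = \{-1, 1\}$, $\mu$ uniform on columns with $x \cdot y \cdot z = 1$, the real basis $\chi_\sigma(x) = \prod_{i \in \sigma} x_i$, and the reading $C = (k\sqrt{q-1})^3 = 27$ of the constant for balanced $\mu$. It is a numerical sanity check, not a proof.

```python
import math
import random
from itertools import product

K, N = 3, 3                                    # three functions, three columns
PARITY = [(x, y, x * y) for x in (-1, 1) for y in (-1, 1)]   # uniform on x*y*z = 1

def random_poly(degree, rng):
    """Random multilinear polynomial on {-1,1}^N: support mask -> Fourier coefficient."""
    return {s: rng.uniform(-1, 1) for s in product((0, 1), repeat=N) if sum(s) <= degree}

def evaluate(poly, x):
    return sum(c * math.prod(xi for bit, xi in zip(s, x) if bit) for s, c in poly.items())

def noise_correlation(polys):
    """E[f_1(X_1) f_2(X_2) f_3(X_3)] with the N columns i.i.d. uniform on PARITY."""
    total = 0.0
    for columns in product(PARITY, repeat=N):
        rows = list(zip(*columns))             # row i is the input of f_i
        total += math.prod(evaluate(p, row) for p, row in zip(polys, rows))
    return total / len(PARITY) ** N

def theorem_bound(polys):
    degree = lambda p: max((sum(s) for s, c in p.items() if c != 0), default=0)
    l2 = lambda p: math.sqrt(sum(c * c for c in p.values()))
    D = sum(sorted(degree(p) for p in polys)[: K - 2])    # deg_{-2}
    delta = max(abs(c) for c in polys[0].values())        # largest coefficient of f_1
    C = (K * math.sqrt(2 - 1)) ** 3                       # balanced case, q = 2
    return C ** D * delta * math.prod(l2(p) for p in polys[1:])

rng = random.Random(0)
polys = [random_poly(d, rng) for d in (1, 2, 2)]
lhs, rhs = abs(noise_correlation(polys)), theorem_bound(polys)
print(lhs <= rhs)
```

For this particular $\mu$ the correlation collapses to $\sum_\sigma \hat{f}_1(\sigma)\hat{f}_2(\sigma)\hat{f}_3(\sigma)$, so the bound is far from tight here; the example is only meant to show how $\delta$, $D$ and the norms enter the right-hand side.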

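Proof. We prove this by induction over $n$. If $n = 0$, the statement is easily
verified (either $D = -\infty$, or $D = 0$, depending on whether one of the functions
is 0 or not).¹

Write $f_i = g_i + h_i$, where

$$g_i = \sum_{\sigma : 1 \notin \sigma} \hat{f}_i(\sigma)\chi_\sigma, \qquad h_i = \sum_{\sigma : 1 \in \sigma} \hat{f}_i(\sigma)\chi_\sigma,$$

i.e., $h_i$ is the part of $f_i$ which depends on $X^1$ (the first column of $X$), and $g_i$ is
the part which does not depend on $X^1$. Then

$$\langle f_1, \ldots, f_k \rangle_\mu = \sum_{T \subseteq [k]} E\left[\prod_{i \notin T} g_i(X_i) \prod_{i \in T} h_i(X_i)\right].$$

For $T \subseteq [k]$, define

$$E(T) = E_X\left[\prod_{i \notin T} g_i(X_i) \prod_{i \in T} h_i(X_i)\right].$$

The key ingredient will be the following Lemma, bounding $|E(T)|$.

Lemma 3.3. Let $\emptyset \subseteq T \subseteq [k]$. Then:

• If $T = \emptyset$, we have

$$|E(T)| \le C^D \delta \prod_{i=2}^{k} \|g_i\|_2.$$

• If $1 \le |T| \le 2$, we have

$$E(T) = 0.$$

• If $|T| \ge 3$, we have

$$|E(T)| \le C^{D+2} \left(\frac{\sqrt{(q-1)/\alpha}}{C}\right)^{|T|} \delta \prod_{i \notin T,\, i \ne 1} \|g_i\|_2 \prod_{i \in T,\, i \ne 1} \|h_i\|_2.$$

¹ We point out that $f_i \in L^2(\Omega_i^0, \mu_i^{\otimes 0})$ does not formally make sense. However in this
case, the appropriate way to view $f_i$ is as an element of $L^2(\Omega_i^N, \mu_i^{\otimes N})$ which only depends
on the $n$ first coordinates, for some large value of $N$. In particular, for the case $n = 0$ we have
that $f_i$ is a constant.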
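Before proving the Lemma, let us see how to use it to finish the proof of
Theorem 3.2.

Write $\|h_i\|_2 = \tau_i \|f_i\|_2$ for some $\tau_i \in [0, 1]$, so that $\|g_i\|_2 = \sqrt{1 - \tau_i^2}\, \|f_i\|_2$
(by orthogonality of the Fourier decomposition). By plugging in the different
cases of Lemma 3.3 we can then bound $|\langle f_1, \ldots, f_k \rangle_\mu|$ by

$$|\langle f_1, \ldots, f_k \rangle_\mu| \le \sum_{T} |E(T)|$$

$$\le C^D \delta \prod_{i=2}^{k} \|g_i\|_2 + \sum_{|T| \ge 3} C^{D+2} \left(\frac{\sqrt{(q-1)/\alpha}}{C}\right)^{|T|} \delta \prod_{i \notin T,\, i \ne 1} \|g_i\|_2 \prod_{i \in T,\, i \ne 1} \|h_i\|_2$$

$$= C^D \delta \prod_{i=2}^{k} \|f_i\|_2 \times \left( \prod_{i=2}^{k} \sqrt{1 - \tau_i^2} + \sum_{|T| \ge 3} C^2 \left(\frac{\sqrt{(q-1)/\alpha}}{C}\right)^{|T|} \prod_{i \notin T,\, i \ne 1} \sqrt{1 - \tau_i^2} \prod_{i \in T,\, i \ne 1} \tau_i \right). \qquad (8)$$

Hence, it suffices to bound the "factor" inside the large parenthesis in (8) by 1
in order to complete the proof of Theorem 3.2.

Let $\tau = \max_{i \ge 2} \tau_i$. Then the factor in (8) can be bounded by

$$\sqrt{1 - \tau^2} + \tau^2 \sum_{i=3}^{k} \binom{k}{i} \left(\frac{\sqrt{(q-1)/\alpha}}{C^{1/3}}\right)^{i}, \qquad (9)$$

where in the sum the value of $i$ corresponds to the size of the set $T$ and we
assumed that $C \ge 1$ and then used that, for $i \ge 3$, $C^{2-i} \le C^{-i/3}$. To bound (9),
we use the following simple lemma:

Lemma 3.4. For every $k \ge 3$,

$$\sum_{i=3}^{k} \binom{k}{i} k^{-i} \le 1/2.$$

Proof. Since $\binom{k}{i} \le k^i / i!$ we have

$$\sum_{i=3}^{k} \binom{k}{i} k^{-i} \le \sum_{i=3}^{k} \frac{1}{i!} \le e - 5/2 \le 1/2,$$

where the second inequality is by the Taylor expansion $e = \sum_{i=0}^{\infty} \frac{1}{i!}$.

□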
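Reading Lemma 3.4 as the bound $\sum_{i=3}^{k} \binom{k}{i} k^{-i} \le 1/2$, proved via the intermediate bound $e - 5/2$, the inequality is easy to confirm numerically. The helper name below is hypothetical and the range of $k$ is an arbitrary choice.

```python
import math

def lemma_3_4_sum(k):
    """sum_{i=3}^{k} C(k, i) / k^i, computed with exact integers before dividing."""
    return sum(math.comb(k, i) / k ** i for i in range(3, k + 1))

values = {k: lemma_3_4_sum(k) for k in range(3, 61)}
tail = math.e - 2.5           # sum_{i >= 3} 1/i!
print(max(values.values()), tail)
```

The sums increase with $k$ towards $e - 5/2 \approx 0.218$, so the constant $1/2$ in the lemma leaves some slack.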

Hence, if $C \ge \left(k\sqrt{(q-1)/\alpha}\right)^3$, the factor in (8) is bounded by

$$\sqrt{1 - \tau^2} + \tau^2/2 \le 1.$$

This concludes the proof of Theorem 3.2. We have not yet addressed the claim
that if the marginals $\mu_i$ are uniform, there is a Fourier basis such that $C$ can be
chosen as $(k\sqrt{q-1})^3$. See the comment after the proof of Lemma 3.3. □
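We now prove the lemma used in the previous proof.

Proof of Lemma 3.3. The case $T = \emptyset$ is a direct application of the induction
hypothesis, since the functions $g_i$ depend on at most $n - 1$ variables (and have
$\deg_{-2}(g_1, \ldots, g_k) \le D$).

For $i \in [k]$, write

$$h_i(x) = \sum_{j=1}^{q-1} \chi_{i,j}(x_1)\, h_{i,j}(x_2, \ldots, x_n)$$

for a Fourier basis $\chi_{i,0} = \mathbf{1}, \ldots, \chi_{i,q-1}$ of $L^2(\Omega_i, \mu_i)$. Denoting by $X^j$ the
$j$th column of $X$, and writing $E_{X^2, \ldots, X^n}$ for the average over $X^2, \ldots, X^n$ we can
write $E(T)$ as

$$E(T) = E_{X^2, \ldots, X^n}\left[\prod_{i \notin T} g_i(X_i)\, E_{X^1}\left[\prod_{i \in T} h_i(X_i)\right]\right] = E_{X^2, \ldots, X^n}\left[H_T(X) \cdot \prod_{i \notin T} g_i(X_i)\right],$$

where

$$H_T(X) = E_{X^1}\left[\prod_{i \in T} h_i(X_i)\right] = \sum_{\sigma \in [q-1]^T} E_{X^1}\left[\prod_{i \in T} \chi_{i,\sigma_i}(X_i^1)\right] \prod_{i \in T} h_{i,\sigma_i}(X_i^2, \ldots, X_i^n).$$

Now for $1 \le |T| \le 2$, the pairwise independence of $\mu$ gives that for any $\sigma \in
[q-1]^T$,

$$E_{X^1}\left[\prod_{i \in T} \chi_{i,\sigma_i}(X_i^1)\right] = 0,$$

hence in this case $H_T(X) = 0$ and by extension $E(T) = 0$.

Thus, only the case $|T| \ge 3$ remains. By Hölder's inequality, we can bound

$$\left| E_{X^1}\left[\prod_{i \in T} \chi_{i,\sigma_i}(X_i^1)\right] \right| \le \prod_{i \in T} \|\chi_{i,\sigma_i}\|_{|T|} \le \prod_{i \in T} \|\chi_{i,\sigma_i}\|_\infty. \qquad (10)$$

By Fact 2.5, $\|\chi_{i,\sigma_i}\|_\infty$ can be bounded by

$$\sqrt{1/\alpha(\mu_i)} \le \sqrt{1/\min_i \alpha(\mu_i)} = \sqrt{1/\alpha}.$$

Hence we can bound the above by $(1/\alpha)^{|T|/2}$.
Plugging this into $E(T)$ gives

$$|E(T)| \le (1/\alpha)^{|T|/2} \sum_{\sigma \in [q-1]^T} \left| E_{X^2, \ldots, X^n}\left[\prod_{i \notin T} g_i(X_i) \prod_{i \in T} h_{i,\sigma_i}(X_i)\right] \right|.$$

For $\sigma \in [q-1]^T$, let $D_\sigma$ be the sum of the $k - 2$ smallest degrees of the
polynomials $\{g_i : i \notin T\} \cup \{h_{i,\sigma_i} : i \in T\}$. Since $g_i$ and $h_{i,\sigma_i}$ are functions of
$n - 1$ variables, we can use the induction hypothesis to get a bound of

$$|E(T)| \le (1/\alpha)^{|T|/2} \sum_{\sigma \in [q-1]^T} C^{D_\sigma} \delta \prod_{i \notin T,\, i \ne 1} \|g_i\|_2 \prod_{i \in T,\, i \ne 1} \|h_{i,\sigma_i}\|_2.$$

But since the $h_{i,\sigma_i}$'s have strictly smaller degrees than the corresponding $f_i$'s,
$D_\sigma$ is bounded by $D - |T| + 2$, and hence we have that

$$|E(T)| \le \alpha^{-|T|/2}\, C^{D - |T| + 2}\, \delta \sum_{\sigma \in [q-1]^T} \prod_{i \notin T,\, i \ne 1} \|g_i\|_2 \prod_{i \in T,\, i \ne 1} \|h_{i,\sigma_i}\|_2$$

$$\le C^{D+2} \left(\frac{\sqrt{(q-1)/\alpha}}{C}\right)^{|T|} \delta \prod_{i \notin T,\, i \ne 1} \|g_i\|_2 \prod_{i \in T,\, i \ne 1} \|h_i\|_2,$$

where we used the fact that $\sum_{j \in [q-1]} \|h_{i,j}\|_2 \le \sqrt{q-1}\, \|h_i\|_2$ (by
Cauchy-Schwarz and orthogonality of the functions $h_{i,j}$).

This concludes the proof of Lemma 3.3. □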

Remark 3.5. In the case when the marginal distributions $\mu_i$ are uniform, one
can take as basis of $(\Omega_i, \mu_i)$ the standard Fourier basis $\chi_y(x) = e^{2\pi i x y / q}$ (where
we identify the elements $x$ of $\Omega_i$ with $\mathbb{Z}_q$). For this basis, $\|\chi_y\|_\infty = 1$ and hence
Equation (10) can be bounded by 1 rather than $(1/\alpha)^{|T|/2}$, which implies that for
this basis, we can choose $C = (k\sqrt{q-1})^3$.



3.1 Corollaries 

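We proceed with some corollaries of Theorem 3.2. The first says that if all
non-empty Fourier coefficients of $f_i$ are small, then the noisy inner product is close
to the product of expectations.

Corollary 3.6. Assume the setting of Theorem 3.2 but with $\|f_i\|_2 \le 1$ for each
$i$ and

$$\delta := \max_{1 \le i \le k-2}\ \max_{\sigma \ne 0} |\hat{f}_i(\sigma)|.$$

Then,

$$\left| \langle f_1, \ldots, f_k \rangle_\mu - \prod_{i=1}^{k} E[f_i] \right| \le \delta (k-2) C^D, \qquad (11)$$

where $C$ and $D$ are as in Theorem 3.2.

Proof. We prove the claim by induction on $k$. The case $k = 2$ is trivial. For the
induction step let $g_1(x) = f_1(x) - E[f_1]$. Then by Theorem 3.2,

$$\left| \langle f_1, f_2, \ldots, f_k \rangle_\mu - E[f_1] \langle f_2, \ldots, f_k \rangle_\mu \right| = \left| \langle g_1, f_2, \ldots, f_k \rangle_\mu \right| \le \delta C^D,$$

and by the induction hypothesis

$$\left| E[f_1] \langle f_2, \ldots, f_k \rangle_\mu - \prod_{i=1}^{k} E[f_i] \right| = |E[f_1]| \cdot \left| \langle f_2, \ldots, f_k \rangle_\mu - \prod_{i=2}^{k} E[f_i] \right| \le (k-3) \delta C^D.$$

The proof follows. □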
A more careful examination of the proof above reveals that in the case where
$\langle f_1, \ldots, f_k \rangle_\mu$ deviates from the product of the expected values, there should be
a basis element with large weight in one of the functions that is correlated with
some other functions. In particular:

Corollary 3.7. Assume the setting of Theorem 3.2 but with $D = \sum \deg(f_i)$
the sum of the degrees of all the functions, and $\|f_i\|_2 \le 1$ for each $f_i$.
Then for all $\delta > 0$ if:

$$\left| \langle f_1, \ldots, f_k \rangle_\mu - \prod_{i=1}^{k} E[f_i] \right| > 2\delta (k-2) C^D, \qquad (12)$$

then there exists an $1 \le i \le k - 2$ and a non-empty multi-index $\sigma$ such that

$$|\hat{f}_i(\sigma)| \ge \delta, \qquad \left| E[\chi_\sigma^i f_{i+1} \cdots f_k] \right| \ge \delta^2 C^D,$$

where $C$ is the constant from Theorem 3.2.

Proof. From the previous proof it follows that if Equation (12) holds then there
exists an $1 \le i \le k - 2$ such that

$$\left| \langle g_i, f_{i+1}, \ldots, f_k \rangle_\mu \right| > 2\delta C^D,$$

where $g_i = f_i - E[f_i]$. Write $g_i = \sum_{\sigma \in A} \hat{g}_i(\sigma)\chi_\sigma + h_i$ where $A$ is the set of all
$\sigma$ for which $|\hat{g}_i(\sigma)| \ge \delta$. Then by Theorem 3.2 it follows that:

$$\left| E[h_i f_{i+1} \cdots f_k] \right| \le \delta C^D,$$

which implies

$$\left| E\left[\left(\sum_{\sigma \in A} \hat{g}_i(\sigma)\chi_\sigma^i\right) f_{i+1} \cdots f_k\right] \right| > \delta C^D.$$

Writing

$$t(\sigma) = E[\chi_\sigma^i f_{i+1} \cdots f_k],$$

for $\sigma \in A$, we see that $\sum_{\sigma \in A} |\hat{g}_i(\sigma) t(\sigma)| > \delta C^D$. Since $\sum_{\sigma \in A} |\hat{g}_i(\sigma)|^2 \le 1$ it
follows that

$$\sum_{\sigma \in A} |\hat{g}_i(\sigma) t(\sigma)| > \delta C^D \sum_{\sigma \in A} |\hat{g}_i(\sigma)|^2,$$

which implies that there exists a $\sigma$ with

$$\left| E[\chi_\sigma^i f_{i+1} \cdots f_k] \right| = |t(\sigma)| > \delta C^D |\hat{g}_i(\sigma)| \ge \delta^2 C^D. \qquad (13)$$

The proof follows. □



Next we apply the previous corollary to Equation (13) and the functions
$f_{i+1}, \ldots, f_k, \chi_\sigma^i$ to obtain that $|E[f_{j+1} \cdots f_k \chi_\sigma^i \chi_{\sigma'}^j]|$ is large for some $j > i$ and
$\sigma'$. Continuing in this manner we obtain the following:

Corollary 3.8. Assume the setting of Theorem 3.2 but with $D = \sum \deg(f_i)$
the sum of the degrees of all the functions, and $\|f_i\|_2 \le 1$ for each $f_i$.
Then for all $\delta > 0$ if:

$$\left| \langle f_1, \ldots, f_k \rangle_\mu - \prod_{i=1}^{k} E[f_i] \right| > C^D \delta, \qquad (14)$$

then there exists a set $I \subseteq [k]$ with $|I| \ge 3$ and for all $i \in I$ a non-zero
multi-index $\sigma(i)$ such that:

• For all $i \in I$:



\fi{°)\ > 



2k 



• For all $a \in \bigcup_{i \in I} S(\sigma(i))$ it holds that

$$|\{ i : a \in S(\sigma(i)) \}| \ge 3$$

(the 3 above may be replaced by $r + 1$ if the distributions involved are $r$-wise
independent).

Proof. Define So = 5 1 ' 2 , and 5i = -gi 3 -. We show by induction on a that it is 
possible to find /,JC [k] disjoint where / is of size at least a and for all i £ / 
there exists a non-zero multi-index a(i) such that for all i £ I: 

2" 



and further 



\fMi))\ > S a 



E 



> 



(2fc) 2a -! 

> C D S a+1 . 



(15) 



(16) 



iei je.J 

The base case a = 1 was established in the previous claim. The induction step 
is proved by noting that if J is non-empty and j £ J, then we may apply the 
previous claim to the sequence of functions fj ,j € J followed by the functions 
X l (<j(i)). We then obtain (fT5|) and lfT6|) with <5 a +i and sets I' and J' where J' 
is of size one smaller than J. When we stop with J — and a < k we obtain 
that J is empty and therefore: 



E 



n 

Lie! 



(T(i) 



> 6^4+1 > 0. 



This together with pairwise independence implies that For all a € Ui e iS(a(i)) 
it holds that 

|{i : a E S(a(i))}\ > 3 
as needed. □ 

We finally note that while all of the results above are stated for low-degree
polynomials, they also apply to polynomials that are almost low-degree. Indeed
Hölder's inequality implies the following.

Proposition 3.9. Assume the setting of Theorem 3.2 and with $k$ functions
satisfying $\|f_i\|_k \le 1$ and $\|f_i^{>d}\|_k \le \epsilon$ for all $i$. Then

$$\left| \langle f_1, \ldots, f_k \rangle_\mu - \langle f_1^{\le d}, \ldots, f_k^{\le d} \rangle_\mu \right| \le k\epsilon(1 + \epsilon)^{k-1}.$$

Proof. The proof follows by using Hölder's inequality $k$ times, each time
replacing $f_i$ with $f_i^{\le d}$. Note that $\|f_i^{\le d}\|_k \le \|f_i\|_k + \|f_i^{>d}\|_k \le 1 + \epsilon$, so that when
making the $i$'th replacement, the error incurred is bounded by

$$\prod_{j=1}^{i-1} \|f_j^{\le d}\|_k \cdot \|f_i^{>d}\|_k \cdot \prod_{j=i+1}^{k} \|f_j\|_k \le (1 + \epsilon)^{i-1} \epsilon.$$

□
4 Applications

The first application is a "weak inverse theorem" for the Gowers norm. From
Theorem 3.2 and the fact that

$$\|f\|_{U^2}^4 = \sum_{\sigma} |\hat{f}(\sigma)|^4,$$

we immediately obtain that

Proposition 4.1. Let f : Z™ — » C have Fourier degree d, have \\fW2 — 1 and 
let k > 2. If the k'th Gowers norm of f satisfies \\f\\u k > e > then there exists a 
multi-index a £Z™ such that 

where the Fourier coefficient is w.r.t. the standard Fourier basis. In particular, 

mu ' 2 ~ ( (2Vg-lJ Jd ) ' 

This implies that for functions of low Fourier degree, all $U^k$ norms for
constant $k \ge 2$ are equivalent. We next obtain a similar result for arithmetic
progressions using Theorem 3.2 and Corollary 3.8.

Proposition 4.2. Let (X\,...,Xk) have the uniform distribution over arith- 
metic progressions of length k in where 3 < k < p. Let Y%, . . . ,Yk be i.i.d. 
and uniformly distributed in Z™ . Let fx , . . . , : Z™ — > C have Fourier degree d 
and || 2 < 1 /or all i. Then, if 

I Hh{Xi) ■ ■ ■ fk(X k )} - nhiY,) ■ ■ ■ f k (Y k )}\ > e, 

it holds w.r.t. the standard Fourier basis that: 

1. None of the functions fi are S -uniform with 



{k^/d~=Tf dk ' 

2. There exist indices 1 < < i(2) < i(3) < k and multi-indices 
£7(1), a(2), <r(3) G 7%, <r(l) fl cr(2) H cr(3) ^ 0, 

SMcft that 

\M;Mj))\>[ k . {kV ^j rdk ) 

for 1 < j < 3. 

We note that the two results above may be interpreted as certain types of 
derandomization results which can be defined in further generality. The basic 
setup is that there are $2k$ vectors $X_1, \ldots, X_k$ and $Y_1, \ldots, Y_k$. All of the vectors
have the same distribution, which is uniform in some product space $\Omega^n$. However,
the $Y_i$'s are independent while the $X_i$'s are only pairwise independent. How can
the two distributions be distinguished? One way to distinguish is to consider
functions $f_i$ of $X_i$ (resp. $Y_i$) and to show that $\prod_{i=1}^k f_i(X_i)$ is far in expectation
from $\prod_{i=1}^k f_i(Y_i)$. Our results show that if the functions $f_i$ are uniform and of
low degree then it is impossible to have such a distinguisher.
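
The distinguishing question can be made concrete with a small exact computation. The sketch below is only an illustration under assumptions that are not part of the results above: it takes $k = 3$, the space $\{-1,1\}^n$, and the pairwise independent distribution in which each coordinate is uniform over triples with $x_i y_i z_i = 1$. The helper names (`sample_space`, `gap`, `dictator`, `average`) are hypothetical. It compares a function with one large Fourier coefficient against a degree-1 function whose coefficients are all $1/n$.

```python
from fractions import Fraction
from itertools import product


def sample_space(n, pairwise_only):
    """Enumerate equally likely triples (x, y, z) of strings in {-1, 1}^n.

    pairwise_only=True: z is the coordinatewise product of x and y, so each
    coordinate is uniform over triples with x_i * y_i * z_i = 1.
    pairwise_only=False: x, y, z are independent and uniform.
    """
    cube = list(product((-1, 1), repeat=n))
    for x in cube:
        for y in cube:
            if pairwise_only:
                yield x, y, tuple(a * b for a, b in zip(x, y))
            else:
                for z in cube:
                    yield x, y, z


def expectation(f, n, pairwise_only):
    """Exact value of E[f(x) f(y) f(z)] under the chosen distribution."""
    total, count = Fraction(0), 0
    for x, y, z in sample_space(n, pairwise_only):
        total += f(x) * f(y) * f(z)
        count += 1
    return total / count


def gap(f, n):
    """Distinguishing advantage |E_pairwise - E_independent| of f."""
    return abs(expectation(f, n, True) - expectation(f, n, False))


def dictator(x):
    # A single Fourier coefficient equal to 1, so not uniform.
    return Fraction(x[0])


def average(x):
    # Degree 1, with every nonzero Fourier coefficient equal to 1/n.
    return Fraction(sum(x), len(x))


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(n, gap(dictator, n), gap(average, n))
```

In this toy setting the dictator function distinguishes the two distributions with advantage 1 for every $n$, while the advantage of the averaging function is $1/n^2$ and so shrinks together with its largest Fourier coefficient.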

We finally note that for all the applications considered here, the results hold
assuming only that each function is close in the $k$'th norm to a function of low
degree, by Proposition 3.9.

5 Possible Extensions 

We briefly comment on possible extensions of the main
result.

5.1 Invariance

The results of [10] show, under stronger conditions, the invariance of the
functions $f_1, \ldots, f_k$. In other words, they show that the distribution of $(f_1, \ldots, f_k)$
under the pairwise independent distribution is close to the distribution under the product
distribution with the same marginals as $\mu$.

One would not expect such a strong conclusion to hold here. Consider
for instance the following example. Let $f : \{-1,1\}^n \to \mathbb{R}$ be defined
by $f(x) = (x_1 - 1)(x_2 + \cdots + x_n)/n^{1/2}$. Then $f$ has Fourier degree 2,
variance $\Theta(1)$, and coefficients of order $n^{-1/2}$. Define a distribution $\mu$ on triples
of strings $(x, y, z) \in (\{-1,1\}^n)^3$ by letting, for each $i \in [n]$, the distribution
on the $i$'th coordinate be the uniform distribution over $(x_i, y_i, z_i)$ satisfying
$x_i \cdot y_i \cdot z_i = 1$. Then $\mu$ is balanced pairwise independent. Now consider the
distribution of $(f(x), f(y), f(z))$ for $(x, y, z)$ drawn from $\mu$, compared to the
distribution of $(f(x), f(y), f(z))$ for $x$, $y$ and $z$ independent uniformly random
strings of $\{-1,1\}^n$. The former is supported only on points where at least one of the
coordinates is $0$ (since one of $x_1, y_1, z_1$ is always $1$). On the other hand, the
latter has an $\Omega(1)$ fraction of its mass on points
such that all three of $|f(x)|$, $|f(y)|$, and $|f(z)|$ are lower bounded by $\Omega(1)$. Hence
the two distributions are not close, even though the Fourier coefficients of $f$ can
be made arbitrarily small by increasing $n$.
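
The example can be checked empirically. The following is a minimal simulation sketch, not part of the original argument: the function names, the sample sizes, and the threshold $0.5$ standing in for the $\Omega(1)$ lower bound are all illustrative assumptions.

```python
import math
import random


def f(x):
    """f(x) = (x_1 - 1)(x_2 + ... + x_n) / sqrt(n) for x in {-1, 1}^n."""
    return (x[0] - 1) * sum(x[1:]) / math.sqrt(len(x))


def random_string(n, rng):
    """Uniformly random string in {-1, 1}^n."""
    return [rng.choice((-1, 1)) for _ in range(n)]


def sample_pairwise(n, rng):
    """Draw (x, y, z) with x_i * y_i * z_i = 1 in every coordinate."""
    x = random_string(n, rng)
    y = random_string(n, rng)
    z = [a * b for a, b in zip(x, y)]
    return x, y, z


def sample_independent(n, rng):
    """Draw x, y, z independently and uniformly."""
    return tuple(random_string(n, rng) for _ in range(3))


def fraction_all_large(sampler, n, trials, threshold, rng):
    """Fraction of samples with |f(x)|, |f(y)|, |f(z)| all >= threshold."""
    hits = 0
    for _ in range(trials):
        if all(abs(f(s)) >= threshold for s in sampler(n, rng)):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    rng = random.Random(0)
    print(fraction_all_large(sample_pairwise, 51, 2000, 0.5, rng))
    print(fraction_all_large(sample_independent, 51, 2000, 0.5, rng))
```

Under the pairwise independent sampler the fraction is exactly zero, because at least one of $x_1, y_1, z_1$ equals $1$ and the corresponding value of $f$ vanishes. Under independent sampling a constant fraction of the samples has all three values bounded away from zero.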

The same reasoning shows that we can not hope for invariance even if
all moments on up to $k - 1$ variables match. E.g., even if $X_1, \ldots, X_k$ are
$(k-1)$-wise independent it is not necessarily the case that the distribution
of $(f(X_1), \ldots, f(X_k))$ is close to a product distribution.

5.2 Relaxed Degree Conditions 

As mentioned before, previous work [12, 11] established results of the type
discussed here by first deriving the results for low degree polynomials and then 
applying "truncation arguments" to obtain results for general bounded functions. 
It seems that in the context of the current paper these truncation arguments 
are more challenging. 

Indeed, it is well-known that in general, large Gowers norm does not imply
large Fourier coefficients (consider e.g. the function $f(x) = (-1)^{\sum_{i=1}^{n-1} x_i x_{i+1}}$
over $\mathbb{Z}_2^n$), and hence one can not hope to drop the requirement of small Fourier
degree and generalize our theorem to general bounded functions.
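
For small $n$ this phenomenon can be verified by brute force. The sketch below is an illustration only, assuming the sum in the exponent runs over $i = 1, \ldots, n-1$ and encoding strings in $\mathbb{Z}_2^n$ as bitmasks so that addition is XOR. The function names are hypothetical. It computes the largest Fourier coefficient in absolute value and the eighth power of the $U^3$ norm.

```python
from itertools import product


def parity(v):
    """Parity of the number of set bits of v."""
    return bin(v).count("1") & 1


def quadratic_phase(x):
    """f(x) = (-1)^(x_1 x_2 + x_2 x_3 + ... + x_{n-1} x_n), x a bitmask."""
    return -1 if parity(x & (x >> 1)) else 1


def max_fourier_coefficient(n):
    """Largest |E_x f(x) (-1)^{a . x}| over all characters a of Z_2^n."""
    size = 1 << n
    best = 0.0
    for a in range(size):
        s = sum(quadratic_phase(x) * (-1 if parity(a & x) else 1)
                for x in range(size))
        best = max(best, abs(s) / size)
    return best


def gowers_u3_power8(n):
    """E over x, h1, h2, h3 of the product of f over the 8 cube vertices.

    f is real valued, so the complex conjugations in the Gowers norm
    can be ignored.
    """
    size = 1 << n
    table = [quadratic_phase(x) for x in range(size)]
    total = 0
    for x, h1, h2, h3 in product(range(size), repeat=4):
        term = 1
        for s1, s2, s3 in product((0, 1), repeat=3):
            term *= table[x ^ (h1 * s1) ^ (h2 * s2) ^ (h3 * s3)]
        total += term
    return total / size ** 4


if __name__ == "__main__":
    print(max_fourier_coefficient(4), gowers_u3_power8(4))
```

For $n = 4$ every Fourier coefficient has absolute value $1/4$, and the value halves again at $n = 6$, while the $U^3$ norm equals 1 because the third discrete derivative of a quadratic form vanishes.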

However, improvements are still possible. First, it is possible that under 
additional conditions on the pairwise independent marginal distributions, the 
requirement on low Fourier degree can be dropped completely. We discuss this 
below. 

A second, closely related possible improvement, is to slightly relax the strong 
Fourier degree requirements. In particular, one can hope that a similar bound 
can be derived for functions with exponentially small Fourier tails, i.e., functions
$f$ such that the total Fourier mass on the high-degree part decays exponentially,
$\|f^{>d}\|_2^2 \le (1 - \gamma)^d$ for some $\gamma > 0$. Such functions arise naturally in many
applications, e.g., when functions are evaluated on slightly noisy inputs. Hence, 
it is natural to ask whether the following extension of our result can be true: 

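Question 5.1. Let $(\Omega, \mu)$ be a pairwise independent product space $\Omega = \Omega_1 \times
\cdots \times \Omega_k$. Is it true that for every $\gamma > 0$ and $\epsilon > 0$, there exists a constant
$\delta := \delta(\gamma, \epsilon) > 0$ such that the following holds? If $f_1, \ldots, f_k$ are functions $f_i \in
L^2(\Omega_i^n, \mu_i^{\otimes n})$ satisfying

• For every $i \in [k]$, $\|f_i\|_\infty \le 1$.

• For every $d \in [n]$, $\|f_i^{>d}\|_2^2 \le (1 - \gamma)^d$.

• For every $\sigma \in \mathbb{Z}_q^n$, $|\hat{f}_i(\sigma)| \le \delta$.
Then

$$\langle f_1, \ldots, f_k \rangle_\mu \le \epsilon.$$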

An affirmative answer to Question 5.1 would also have consequences for
completely dropping the degree requirement under additional conditions on the
marginal distributions.

In particular, for marginal distributions whose support is connected in the
sense described in Section 1.1, it is known by [11] that applying a small amount
of noise to each of the functions $f_1, \ldots, f_k$ does not change $\langle f_1, \ldots, f_k \rangle_\mu$ by
much.

Since applying noise gives exponentially decaying Fourier tails, an affirmative
answer to Question 5.1 implies that for connected marginal distributions, the
condition on the Fourier degree of the functions can be dropped completely.

The statement of Question 5.1 allows for much weaker bounds on the error
$\epsilon$ than we had in Theorem 3.2, where the error bound was of the form
$\lambda(d, \delta) \cdot \prod_{i=2}^k \|f_i\|_2$ (where $\lambda(d, \delta) = \delta C^d$). One can not hope for such a strong
error bound in the setting of Question 5.1 (with $\lambda(d, \delta)$ replaced by some function
$\Lambda(\gamma, \delta)$ depending on the rate of decay of the Fourier tails, rather than
the degree), as illustrated by the following example communicated to us by
Hamed Hatami, Shachar Lovett, Alex Samorodnitsky and Julia Wolf: consider
a pairwise independent distribution $\mu$ on $\{0,1\}^k$ in which the first $\approx \log k$ bits
are chosen uniformly at random, and the remaining bits are sums of different
subsets of the first $\log k$ bits. This distribution is not connected in the sense
described above, but that can easily be arranged by adding a small amount of
noise to $\mu$, which will not have any significant impact on the calculations which
follow. Let $f : \{0,1\}^n \to \{0,1\}$ be the function which returns 1 on the all-zeros
string, and 0 otherwise. Then, one has that

$$\langle f, \ldots, f \rangle_\mu = \Pr[X_1 = \cdots = X_k = 0] \approx 2^{-n \log k},$$

whereas $\|f\|_2 = 2^{-n/2}$ and hence the product $\prod_{i=2}^k \|f\|_2$ equals $2^{-n(k-1)/2}$, so
that

$$\Lambda(\gamma, \delta) \cdot \prod_{i=2}^k \|f\|_2 = \Lambda(\gamma, \delta) \, 2^{-n(k-1)/2} \ll \langle f, \ldots, f \rangle_\mu.$$
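
The arithmetic in this example can be reproduced exactly for small parameters. The sketch below rests on assumptions that are not taken from the text: it uses $m$ uniform seed bits and one output bit for every nonempty subset of the seeds, so that $k = 2^m - 1$, and it ignores the small amount of noise needed for connectedness. The function names are hypothetical. To stay within rational arithmetic it returns the square of the product of norms.

```python
from fractions import Fraction
from itertools import combinations, product


def build_distribution(m):
    """List the 2^m equally likely outcomes of the k = 2^m - 1 bits.

    Bit number s (a nonempty subset of the m seed bits, encoded as a
    bitmask) is the XOR of the seed bits selected by s.
    """
    subsets = range(1, 1 << m)
    return [tuple(bin(seed & s).count("1") & 1 for s in subsets)
            for seed in range(1 << m)]


def is_pairwise_independent(outcomes):
    """Check that every pair of coordinates is uniform on {0, 1}^2."""
    k = len(outcomes[0])
    for i, j in combinations(range(k), 2):
        for a, b in product((0, 1), repeat=2):
            count = sum(1 for w in outcomes if w[i] == a and w[j] == b)
            if Fraction(count, len(outcomes)) != Fraction(1, 4):
                return False
    return True


def all_zero_probability(outcomes, n):
    """Pr[X_1 = ... = X_k = 0] when the n coordinates are i.i.d."""
    per_coordinate = Fraction(sum(1 for w in outcomes if not any(w)),
                              len(outcomes))
    return per_coordinate ** n


def norm_product_squared(k, n):
    """Square of prod_{i=2}^k ||f||_2, that is 2^{-n(k-1)}."""
    return Fraction(1, 2) ** (n * (k - 1))


if __name__ == "__main__":
    outcomes = build_distribution(4)
    k = len(outcomes[0])
    print(k, is_pairwise_independent(outcomes))
    print(all_zero_probability(outcomes, 5) ** 2, norm_product_squared(k, 5))
```

With $m = 4$ (so $k = 15$) and $n = 5$ the correlation is $2^{-20}$ while the product of norms is $2^{-35}$, and the ratio grows exponentially in $n$. With $m = 3$ the two exponents coincide, so the gap only appears once $k$ is large enough.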

One may argue that it is more reasonable to bound $\langle f_1, \ldots, f_k \rangle_\mu$ in terms of
e.g. the $\ell_k$ norms of the $f_i$'s rather than the $\ell_2$ norms. We do not know of any
counterexample to such a strengthening of Question 5.1.